[...] automatically effecting electronic camera movement to track and display
the location of a moving object, such as a person presenting a talk to an
audience. A fixed spotting camera (110) is used to capture a field of view, and
a moving tracking camera (120) with pan/tilt/zoom/focus functions is driven
(controller 520) to the present location of the moving object. Information for
driving the tracking camera is obtained with reference to the pixel difference
between a current image (300) and a previous image (200) within the field of
view. A tracking algorithm computes the information necessary to drive the
tracking camera from these pixel differences as well as data relative to the
field of view of the spotting camera and the present tracking camera position.

AUTOMATIC TRACKING CAMERA CONTROL SYSTEM

Field of the Invention 

This invention relates generally to camera control systems and, more
specifically, to a two-camera system which automatically tracks movement of an
object in the image space under control of a tracking algorithm.

Background of the Invention 

Audio/visual presentations given in a corporate setting are seldom
attended by as many people as would like to see and hear such presentations. It
is often inconvenient to travel to where the talk is given and one must be free
at the appointed hour. Televising talks over a video network and/or making the
video tape available for later viewing are often viable alternatives for a
corporation having many geographically dispersed work locations.

Installation of commercial-grade television equipment in a large meeting
room or auditorium can transform such a location into a simple and cost
effective television studio. If the equipment is not overly elaborate nor
difficult to operate, then only one person can do the work usually assigned to
two or more trained personnel. As a result, it is economical and very
convenient to record and telecast presentations made in that room so they can
be seen and heard at other locations, and even at different times if desired.

One weakness of the one-operator system is that a person who walks
around during their presentation can present a significant work load to the
system operator who must keep up with the movement of the person. This extra
work load becomes a distraction from the system operator's principal task of
presenting the most appropriate image to the remote audience or to the
recording medium.

The prior art is devoid of teachings and suggestions for a video system
wherein a camera arrangement can track a presenter who paces and/or
gesticulates, and thereby provides to the system operator another image which
may be appropriately selected for immediate display or recording for later
replay.

Summary of the Invention 
Instead of requiring the operator to follow the presenter by physically
controlling the movement of the camera so as to make available an image at the
system control console which may then be displayed to the audience or captured
on a recording medium, the technique in accordance with the present invention
effects camera movement automatically and the operator merely selects the
display image as one of [...]

FIG. 6 depicts a typical spotting image detected by the spotting camera;

FIG. 7 depicts the typical locations of a Threshold Box, a Maximum
Search Box, a Current Search Box, and a Block Box within the spotting image;

FIG. 8 depicts a flow diagram of the tracking algorithm in accordance
with the present invention;

FIG. 9 depicts the occurrence of a Bounding Box appearing totally within
the Current Search Box and outside any Block Box;

FIG. 10 depicts the occurrence of adjacent Sub-Bounding Boxes, one
within a Block Box and the other within the Current Search Box but outside any
Block Box;

FIG. 11 depicts the occurrence of spaced-apart Sub-Bounding Boxes, one
within a Block Box and the other within the Current Search Box but outside any
Block Box;

FIG. 12 depicts the arrangement of the Current Search Box, the Tracking
Frame, and the Camera Frame; and

FIGS. 13 and 14 depict the direction of movement for the edges of the
Current Search Box relative to the Camera Frame and Maximum Search Box for the
cases of a high Confidence Measure and low Confidence Measure, respectively, as
to the location of the person.

Detailed Description 

The basic characteristics of the operating environment where the
automatic camera control system is used are as follows:

(1) the object being tracked, such as a person, is the only object likely to be
moving in the range of potential scenes, except for pre-determined areas which
will have motion but where such motion in isolation is not of interest;

(2) only one object is to be tracked, or multiple objects which generally are
in motion simultaneously are to be tracked; and

(3) failure to stay "on target" is not a serious flaw, provided it does not
happen too often nor persist too long.

These characteristics then impose some limitations on the operating
environment, expressed as:

(i) The background of the area of potential scenes must be highly static; for
example, in the illustrative case, no drapes or plants constantly swaying, no
windows or doorways that can see outside traffic.

(ii) Objects which may move but are not of interest must be well separated from
the area where the tracked object moves from the camera's point-of-view. For
instance, the members of the audience must be well separated from the area
where a person giving the talk stands and moves.

(iii) Areas of potential scenes which may exhibit motion and where the tracked
object may also move can be identified and marked with a Block Box. As an
example, a person giving the talk may walk in front of a projection screen
where slides are shown and changed.

Placement of the Camera System in a Room 

The diagram of FIG. 1 is an overhead view of room 100, typical of an
environment where tracking camera system 109 in accordance with the present
invention is used. Person 101 - the object or target of interest - is standing
in the center of stage platform 102, in front of projection screen 103, and is
facing audience area 105. Two cameras compose electronic camera tracking
system 109, namely spotting camera 110 and tracking camera 120. Typically,
camera system 109 is suspended from the ceiling of room 100.

Spotting camera 110 does not move and looks at the entire front of the
room, as shown by the dotted lines emanating from camera 110 forming a solid
viewing angle 111. Camera 110 does not pan, tilt, zoom or change focus.

Tracking camera 120 is remotely controllable and can be made to pan, tilt,
zoom and focus by commands issued by a computer (not shown). These commands
cause the image within the solid viewing angle 121, called the "tracking
image," of the tracking camera to change and thereby tend to keep the moving
person 101 in view. The focus function of the tracking camera may include an
automatic focus capability.

The image captured within angle 111, called the "spotting image,"
determines the coverage of the system and hence the range of possible views
which may be seen by tracking camera 120. If person 101 leaves the spotting
image, the tracking camera will not be able to follow him or her outside of
angle 111. As the system tends to keep the tracking image centered on the
target, it is possible for part of the tracking image to view portions of the
room which are outside of the spotting image when the target is near an edge of
the spotting image.

Finding the Moving Target

The requirements that only person 101 moves and that everything else in
the view of spotting camera 110 be stationary, combined with the
sequential-image nature of television, can be effectively utilized to develop a
camera control algorithm to "lock onto" person 101.

A television moving picture is a series of static images, much like motion
picture film. Each television image is made up of lines of pixels where each
pixel may be in any of a range of brightness values and color values which may
be represented by numbers. By way of a heuristic discussion (this is an
idealized description of the process - the actual algorithm is discussed in
detail below), a non-moving television camera is considered. If one television
image is captured and a second image is captured shortly thereafter, and then
the image information of the second is subtracted from the first, all the
pixels that represent objects which did not move will be the same and their
difference will be zero. Those pixels whose brightness or color values changed
between the first and second image will have non-zero pixel differences. These
pixel differences indicate both where the person was and the present location
of the (moving) person 101. All pixels whose brightness and color values did
not change from the first to the second image represent objects which are not
apparently moving and will have pixel differences of zero.

The sequence of images of FIGS. 2-4 depicts the effect. The pictorial
information illustrated by spotting image 200 of FIG. 2 represents the first or
"previous" spotting image, and the pictorial information illustrated by
image 300 of FIG. 3 represents the second or "current" spotting image.
"Difference" image 400 of FIG. 4 depicts the absolute value of the pixel
differences and, since person 101 was the only object that moved, the double
image 405 represents where person 101 was and is. All other objects in the
spotting image did not move and the difference image does not represent them.
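
The differencing step can be sketched as follows. This is a minimal
illustration, assuming each image is a list of rows of integer luminance
values; the function name and the sample data are hypothetical and do not come
from the specification.

```python
def difference_image(previous, current):
    """Return the absolute per-pixel difference of two equally sized images."""
    return [
        [abs(c - p) for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


# A bright "object" (value 90) moves from column 1 to column 3 on the middle row.
previous = [[10, 10, 10, 10], [10, 90, 10, 10], [10, 10, 10, 10]]
current = [[10, 10, 10, 10], [10, 10, 10, 90], [10, 10, 10, 10]]
diff = difference_image(previous, current)
```

The non-zero entries appear at both the old and the new location of the
object, which is the "double image" effect described above.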

In practice, the double image 405 itself is not used directly. Instead, as
the lines of pixels are processed and corresponding pixels are subtracted, the
x and y coordinates of the highest, lowest, left-most, and right-most pixels
which were non-zero are marked. These coordinates define the "Bounding
Box" 410. The Bounding Box 410 represents the differences of the double
image 405 and is used by the tracking algorithm to determine how much panning,
tilting, and zooming of tracking camera 120 is appropriate to drive it to
tracking image position 420 where it acquires a close-up picture of person 101.
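
A minimal sketch of collecting the Bounding Box extents while scanning a
difference image is shown below. The (left, top, right, bottom) tuple layout,
the optional `threshold` parameter, and the function name are assumptions made
for illustration only.

```python
def bounding_box(diff, threshold=0):
    """Return (left, top, right, bottom) of pixels above threshold, or None."""
    left = top = right = bottom = None
    for y, row in enumerate(diff):
        for x, value in enumerate(row):
            if value > threshold:
                if left is None:
                    # First changed pixel seen: it defines all four edges.
                    left, top, right, bottom = x, y, x, y
                else:
                    left = min(left, x)
                    right = max(right, x)
                    top = min(top, y)
                    bottom = max(bottom, y)
    if left is None:
        return None  # no motion detected
    return (left, top, right, bottom)


sample_diff = [
    [0, 0, 0, 0, 0],
    [0, 80, 0, 80, 0],
    [0, 0, 40, 0, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(sample_diff)
```

Only four running extremes are kept, so the difference image never has to be
stored in full for this step.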

Thus, based on the pixel differences of the previous spotting image 200
and the current spotting image 300 taken by spotting camera 110, tracking
camera 120 captures the tracking image of person 101, as discussed in more
detail below.

The mapping between the location of the Bounding Box within the
difference image 400 (and hence within the current spotting image 300) and the
commands sent to the Pan/Tilt/Zoom/Focus subsystem (presented below) of
tracking camera 120 is based on knowing the settings which aim the tracking
camera 120 to predetermined locations within the viewing angle 111 of spotting
camera 110. Typically, these predetermined locations may be the four corners of
the spotting image 200 of spotting camera 110. These settings are determined
during alignment procedures accomplished during system installation.
Interpolation allows for the pointing of tracking camera 120 with some accuracy
at person 101.
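
The text says only that interpolation is used; one plausible reading is
bilinear interpolation between the four corner settings. The sketch below
assumes that reading, and the corner names, the (pan, tilt) units, and the
image size are all hypothetical.

```python
def interpolate_setting(x, y, width, height, corners):
    """Bilinearly interpolate a (pan, tilt) setting for pixel (x, y).

    corners maps "top_left", "top_right", "bottom_left" and "bottom_right"
    to the (pan, tilt) settings recorded during alignment (assumed layout).
    """
    u = x / (width - 1)   # 0.0 at the left edge, 1.0 at the right edge
    v = y / (height - 1)  # 0.0 at the top edge, 1.0 at the bottom edge

    def lerp(a, b, t):
        return a + (b - a) * t

    result = []
    for i in range(2):  # index 0 is pan, index 1 is tilt
        top = lerp(corners["top_left"][i], corners["top_right"][i], u)
        bottom = lerp(corners["bottom_left"][i], corners["bottom_right"][i], u)
        result.append(lerp(top, bottom, v))
    return tuple(result)


corners = {
    "top_left": (-30.0, 10.0),
    "top_right": (30.0, 10.0),
    "bottom_left": (-30.0, -10.0),
    "bottom_right": (30.0, -10.0),
}
center = interpolate_setting(320, 240, 641, 481, corners)
```

A real installation may need a non-linear correction for lens geometry and for
the offset between the two cameras; this sketch ignores both.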

System Architecture 

System architecture 500 of an illustrative embodiment in accordance with
the present invention is depicted in FIG. 5, and the operation of system 500 is
succinctly described as follows:

(1) Video images (an exemplary image is shown by depiction 531) from spotting
camera 110 are sequentially captured by Video Analog-to-Digital (A-to-D)
converter 511 in computer 510, and stored in one of the frame buffers 512.

(2) Processor 513 retrieves necessary data from a frame buffer 512, analyzes
such data, and computes the action, based on a tracking algorithm discussed in
detail below, to be effected by tracking camera 120.

(3) As an optional side-effect, processor 513 may develop an image displaying
information related to the tracking algorithm. That image is transferred into
Video Digital-to-Analog (D-to-A) converter 514 via one of the frame
buffers 512 and displayed on a video monitor (not shown); such image display
information is shown by depiction 552.

(4) The actions required of tracking camera 120 are communicated to
Pan/Tilt/Zoom/Focus controller 520 as commands over interface bus 515.
Controller 520 may sometimes respond with position data in response to the
commands, and this position data is returned to processor 513 via interface
bus 515.

(5) Controller 520 translates those commands into drive signals to perform the
Pan/Tilt control and the Zoom/Focus control; these drive signals are
transmitted to tracking camera 120 via leads 521 and 522, respectively. The
signal on lead 521 is delivered to Pan/Tilt Head 121 of tracking camera 120,
whereas the Zoom/Focus signal on lead 522 is delivered to Zoom/Focus
subsystem 122 of tracking camera 120. The Pan/Tilt head 121 may sometimes
respond with position data in response to the commands, and this position data
is returned to controller 520 via leads 521. The Zoom/Focus subsystem 122 may
sometimes respond with position data in response to the commands, and this
position data is returned to controller 520 via leads 522.

(6) Pan/Tilt head 121 and Zoom/Focus subsystem 122 respond to these control
signals, driving the tracking camera to tracking image position 420. The
tracking camera 120 thus acquires a tracking image of the person 101
(depiction 553 is exemplary) which is transmitted via lead 123 for use. For
example, lead 123 may go to system control console 540 where it might become a
camera image selected by the operator for display or taping purposes.

Discussion of the Operating Environment

Before the Tracking Algorithm is described in detail, aspects of the
operating environment necessary to understanding the algorithm are discussed.

A typical image taken by spotting camera 110, designated spotting
image 600, is depicted in FIG. 6. Of particular note is that members 105 of the
audience in region 605 are visible within the spotting image and that
person 101 can walk in front of the projection screen 103. Both people in the
audience and the projection screen may present motion which should normally not
be tracked.

Particular areas of interest are designated within spotting image 600,
called "boxes", as illustrated in FIG. 7. Specifically, the following boxes are
defined: Threshold Box 705, Maximum Search Box 710, and an optional Block
Box 715. Each box is defined to the tracking algorithm after spotting
camera 110 is installed in the room and remains constant. Defining the boxes is
only required once in most installations. In situations where the system must
deal with very different environments, such as when the same room has radically
different setups and uses, it will be necessary to define several sets of boxes
and select the appropriate set for each use. Some uses may require no Block
Box 715; others may require more than one Block Box 715.

Threshold Box 705 is placed so that it covers an area of the image which
is unlikely to see motion, such as the ceiling of the room. The area within
Threshold Box 705 should be illuminated to approximately the same extent as the
area within Maximum Search Box 710.

Maximum Search Box 710 defines the maximum area within the spotting
image 600 where control system 109 will attempt to discern motion. Things and
people which are likely to move and are not to be tracked, such as members of
the audience, should not be within the Maximum Search Box. In FIG. 7, Maximum
Search Box 710 is sized from about knee-height of person 101 to as high as the
tallest person might reach. The bottom of Maximum Search Box 710 is well above
the heads of the audience so that the heads and raised hands of the audience
are unlikely to be within Maximum Search Box 710.

Within the Maximum Search Box 710 is the Current Search Box [...]. The
Current Search Box is the area within which Pixel Differences are computed and
changes in size as the algorithm runs, as described below. It is never larger
than the Maximum Search Box 710.

Block Box 715 defines an area at least partially within the Maximum
Search Box 710 where there is likely to be motion which is usually, but not
always,
[...] by pushing a button on [...] when person 101 changes slides [...]
Maximum [...] Block Box 715, except when [...] rectangular and [...]

[...] defined:

[...] pixels in an image scan line [...]

Pixel Value - the [...]. Typically luminance values range from zero [...]; a
typical maximum luminance pixel [...]

[...] Count - the number of pixels within a [...]

Bounding Box - [...] Current [...] Bounding Box; such [...]
[...] entire area [...]

(b) Mount, align and adjust tracking [...] pan/tilt head 121 so that they [...]
image all areas of spotting image 700 at the [...] telephoto (greatest
magnification) zoom. [...] camera 110 and tracking [...] each other as
possible. The [...] installation places the vertical lens [...] coplanar and
minimum possible displacement between the horizontal lens axes.

(c) Define Threshold Box 705 to the algorithm.

[...] Define Maximum Search Box [...]
Processing Block 820: Compute Current Threshold.

The Pixel Difference is a combination of actual image differences due to
objects moving and video noise. The video noise may be contributed by several
sources, typically including:

(a) noise in the imaging detector within spotting camera 110,

(b) noise in the imaging circuitry within spotting camera 110, and

(c) noise in video Analog-to-Digital circuitry 511.

If not accounted for, video noise could cause Pixel Differences which
would appear to the algorithm as object motion. By calculating a Current
Threshold which is higher than the contribution of video noise to the Pixel
Differences, the algorithm can avoid mistakenly identifying video noise as
motion.

As part of the setup procedure, a box in the image of spotting camera 110
is picked as the Threshold Box 705. This box is in a portion of the spotting
image of camera 110 which is illuminated approximately the same as the Maximum
Search Box 710 but which is never expected to see motion. Therefore, any Pixel
Differences within the presumedly static region encompassed by the Threshold
Box can be assumed to be due to video noise only. This video noise is assumed
to be representative of the video noise which occurs within the Maximum Search
Box. The Current Threshold is set as a function of this video noise.

For each pixel within the Threshold Box 705, calculate the Pixel
Difference between the Current Image and the Previous Image. The largest value
found is the Current Noise Level. Set the Current Threshold as the Current
Noise Level plus the Threshold Bias.
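
The threshold computation can be sketched as below, assuming images are lists
of rows of integer luminance values and the Threshold Box is given as an
inclusive (left, top, right, bottom) tuple. The function and parameter names
are illustrative only.

```python
def current_threshold(previous, current, threshold_box, threshold_bias):
    """Return the Current Noise Level in the Threshold Box plus the bias."""
    left, top, right, bottom = threshold_box  # inclusive pixel coordinates
    noise_level = 0
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            # Largest Pixel Difference in a region assumed to be static.
            noise_level = max(noise_level, abs(current[y][x] - previous[y][x]))
    return noise_level + threshold_bias


prev_img = [[10, 12, 10], [50, 50, 50]]
cur_img = [[11, 9, 10], [90, 50, 50]]
# The Threshold Box covers only the top row, where nothing should be moving.
threshold = current_threshold(prev_img, cur_img, (0, 0, 2, 0), threshold_bias=5)
```

Because the threshold is recomputed every cycle, it follows slow changes in
noise, for example as room lighting or camera gain drifts.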

Processing Block 830: Determine Pixel Differences Between Current and
Previous Images.

Creating a double image 405 which is a good indication of where a
person 101 is now depends on that person having moved recently enough and far
enough for the difference between a Previous Image and the Current Image to
create a representative Bounding Box 410. A means to strengthen that difference
is to keep several Previous Images, ordered by increasing age relative to the
Current Image. If a strong difference occurs between the newest Previous Image
and the Current Image, that is an indication that there is a lot of recent
motion and is sufficient to continue with this [...]. If that recent difference
shows only slight motion, such as when the person is moving slowly or only
moving one portion of their body, comparison between the next older Previous
Image and the Current Image will create a double image 405 which will show the
motion that has occurred over a longer period of time and hence may present a
stronger difference on which to base the rest of the algorithm.
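
One way to walk through the Previous Images from newest to oldest is sketched
below. The cutoff `strong_count` for what counts as a "strong" difference, and
the fallback to the oldest image, are assumptions introduced here; the
specification does not give a numeric criterion.

```python
def count_differences(previous, current, threshold):
    """Count pixels whose absolute difference exceeds the threshold."""
    return sum(
        1
        for prev_row, cur_row in zip(previous, current)
        for p, c in zip(prev_row, cur_row)
        if abs(c - p) > threshold
    )


def pick_previous_image(previous_images, current, threshold, strong_count):
    """Return the index of the Previous Image to difference against.

    previous_images is ordered newest first and must not be empty.
    strong_count is a hypothetical cutoff for a "strong" difference.
    """
    for index, previous in enumerate(previous_images):
        if count_differences(previous, current, threshold) >= strong_count:
            return index
    # No image gave a strong difference: fall back to the oldest one.
    return len(previous_images) - 1


current_img = [[0, 0, 9, 9]]
newest = [[0, 0, 0, 9]]  # differs from the current image in one pixel
older = [[9, 9, 0, 0]]   # differs from the current image in four pixels
chosen = pick_previous_image([newest, older], current_img, threshold=5, strong_count=3)
```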



WO 94/17636                                                     PCT/US94/00866



When the person moves in an area only within the Current Search Box, [...]
in FIG. 10, the portion of the [...] ignored. When there is a Sub-Bounding [...]
Bounding Box 1010 outside the same Block Box [...] adjacent to each other, the
two are merged creating [...]. If there is a significant distance between the
two Sub-Bounding Boxes, as depicted by FIG. 11, such as when changing the slide
generates Sub-Bounding Box 1005 in the Block Box in addition to Sub-Bounding
Box 1010 outside the same Block Box 715, Sub-Bounding Box 1010 becomes the
Bounding Box 410 used for tracking.

Within the Current Search Box [...] Current Image and the corresponding
Previous Image. If the Pix[...] the Current [...] Threshold [...] Threshold
within that Sub-Bounding Box 1010.

[...] Bounding Boxes are within 1% of the Scan Line Length.

Processing Block 850: Determine the Tracking Frame.

The aspect ratio of the tracking camera image is 4-wide to 3-high. Since a
person does not match the 4:3 aspect ratio, a compromise is required when [...].
Note that in the [...] person 101 is more appropriate to display than his or
her feet. A reasonable compromise is to position the top of the Tracking
Frame 1210 proximate to the top of the Bounding Box (which usually corresponds
to the top of the person's head) and make the Tracking Frame as wide as the
Bounding Box, as depicted in FIG. 12.

However, because the algorithm is tracking the difference in the person's
position and not the outline of the person, the Bounding Box does not uniformly
surround the person's image. The algorithm smooths the size and position of
Tracking Frame 1210 to help keep the tracking camera 120 from "jumping around".
Many algorithms may be readily devised by those with skill in the art, such as
averaging, for smoothing the position of the Tracking Frame. The intent of
smoothing the Tracking Frame is to keep it positioned on person 101 and
appropriately sized, and to respond quickly when the person moves. The
smoothing algorithm may also choose to enlarge the Tracking Frame when the
person moves around frequently within a relatively confined area so that the
person is always seen in the tracking camera image without moving the tracking
camera. Later, if the person stops moving so frequently, the algorithm may
choose to shrink the Tracking Frame to create a close-up image.
Accordingly, the algorithm is then used to:

Compute the smoothed y position of the top of the Bounding Box. Use
the smoothed y as the top of the Tracking Frame;

Compute the smoothed width of the Bounding Box. Use the smoothed
width as the width of the Tracking Frame; and

Compute the smoothed x position of the center of the Bounding Box. Use
the smoothed x as the desired horizontal center of the Tracking Frame.
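
The specification leaves the smoothing method open and mentions averaging as
one option. The sketch below uses an exponential moving average as one such
choice; the class name, the `alpha` weight, and the (left, top, right, bottom)
box layout are assumptions for illustration.

```python
class SmoothedTrackingFrame:
    """Exponentially smooth the top, width and center x of the Bounding Box."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # weight given to the newest measurement
        self.top = None
        self.width = None
        self.center_x = None

    def update(self, bounding_box):
        left, top, right, _bottom = bounding_box
        targets = (top, right - left, (left + right) / 2)
        if self.top is None:
            # First measurement: adopt it directly.
            self.top, self.width, self.center_x = targets
        else:
            a = self.alpha
            self.top += a * (targets[0] - self.top)
            self.width += a * (targets[1] - self.width)
            self.center_x += a * (targets[2] - self.center_x)
        return (self.top, self.width, self.center_x)


frame = SmoothedTrackingFrame(alpha=0.5)
first = frame.update((100, 50, 140, 200))
second = frame.update((120, 70, 180, 200))
```

A larger `alpha` responds faster to movement but passes more jitter through to
the camera; a smaller value does the opposite.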

As part of the smoothing, calculate a Confidence Measure of the
difference for use later. This Confidence Measure may be based on the number of
Pixel Differences that are above the Current Threshold within the Bounding Box.
The Confidence Measure may take into account the recent history of the Pixel
Differences. If there is a sudden, large increase in the number of Pixel
Differences it is likely to be due to an event which is not a person moving,
such as a change in room lighting. The Confidence Measure discounts such large
changes unless they persist.

Likewise, a very small number of Pixel Differences after a persistent
recent history of relatively large numbers of Pixel Differences is likely to be
less important and is also discounted, unless they persist for a time.

Processing Block 860: Drive Tracking Camera to the position defined by the
Camera Frame.

Even with the smoothing described in Processing Block 850, too many
fine adjustments to the tracking camera 120 image may become a distraction to
those watching. As depicted by FIG. 12, hysteresis is introduced by selectively
adding a fixed percentage (the Extra Width Percentage, typically 100%) to the
width and height of the Tracking Frame 1210, creating Camera Frame 1205.
Pan/Tilt head 121 and Zoom/Focus subsystem 122 are driven to capture Camera
Frame 1205 with tracking camera 120.

Most motions of the Tracking Frame which do not take it outside the
Camera Frame do not cause the Camera Frame to move. Motions of the Tracking
Frame which do take it outside the Camera Frame cause the Camera Frame to move
and hence the Pan/Tilt Head 121 will be moved to point the tracking camera 120
appropriately (see below).

When the Tracking Frame changes size so that the Camera Frame is no
longer the fixed percentage bigger than the Tracking Frame, the Camera Frame is
adjusted in size to make it conform and hence the zoom lens of Zoom/Focus
subsystem 122 will be adjusted to change the magnification of the image
captured by the tracking camera 120 appropriately (see below).
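
The positional hysteresis can be sketched as follows. Frames are assumed to be
(left, top, right, bottom) tuples and the function names are hypothetical. The
sketch covers only the "move when the Tracking Frame leaves the Camera Frame"
rule and omits the separate resize rule described above.

```python
def expand(frame, extra_percent):
    """Grow a frame about its center by extra_percent of its width and height."""
    left, top, right, bottom = frame
    dx = (right - left) * extra_percent / 100 / 2
    dy = (bottom - top) * extra_percent / 100 / 2
    return (left - dx, top - dy, right + dx, bottom + dy)


def contains(outer, inner):
    """Return True if inner lies entirely within outer."""
    return (
        outer[0] <= inner[0]
        and outer[1] <= inner[1]
        and outer[2] >= inner[2]
        and outer[3] >= inner[3]
    )


def update_camera_frame(camera_frame, tracking_frame, extra_percent=100):
    """Keep the Camera Frame unless the Tracking Frame has left it."""
    if camera_frame is not None and contains(camera_frame, tracking_frame):
        return camera_frame  # small motions do not move the camera
    return expand(tracking_frame, extra_percent)


cam = update_camera_frame(None, (100, 100, 200, 200))
cam_after_small_move = update_camera_frame(cam, (120, 110, 220, 210))
cam_after_large_move = update_camera_frame(cam, (200, 100, 300, 200))
```

With the typical 100% setting, the Camera Frame is twice as wide and twice as
tall as the Tracking Frame, so the person can shift by up to half a frame
width in either direction before the camera is driven again.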

Once the pan, tilt and zoom settings are determined, they are sent to the
Pan/Tilt/Zoom/Focus controller 520 which in turn causes the tracking
camera 120 to respond appropriately. When the tracking camera and person 101
are both moving, the current automatic focus technology often has difficulty
getting a correct setting. Automatic focusing of tracking camera 120 has been
much more successful when activated only after tracking camera 120 has stopped
moving. Thus, if Camera Frame 1205 has been moving and has now stopped and not
moved for a short while (typically, a second or two), processor 513 sends an
AutoFocus command to the Zoom/Focus subsystem 122 via the Pan/Tilt/Zoom/Focus
controller 520.
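
The deferred AutoFocus trigger can be sketched as a small state holder. The
class name, the `settle_seconds` default, and the caller-supplied timestamps
are assumptions; how the command is actually issued to the controller is
outside this sketch.

```python
class AutoFocusTrigger:
    """Signal an AutoFocus request once the Camera Frame has settled."""

    def __init__(self, settle_seconds=1.5):
        self.settle_seconds = settle_seconds
        self.last_move_time = None
        self.pending = False

    def update(self, moved, now):
        """Return True exactly once after the frame has been still long enough.

        moved: whether the Camera Frame moved during this cycle.
        now: current time in seconds, supplied by the caller.
        """
        if moved:
            self.last_move_time = now
            self.pending = True
            return False
        if self.pending and now - self.last_move_time >= self.settle_seconds:
            self.pending = False
            return True
        return False


trigger = AutoFocusTrigger(settle_seconds=1.5)
events = [
    trigger.update(True, 0.0),   # camera moving
    trigger.update(False, 1.0),  # stopped, but not for long enough
    trigger.update(False, 2.0),  # settled: request AutoFocus
    trigger.update(False, 3.0),  # already focused, nothing to do
]
```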

Processing Block 870: Select the Current Search Box.

When the algorithm has a clear indication of where the person is, the
assumption is made that this is the person of interest and there is no need to
look anywhere else. Under this condition, the algorithm ignores other motions
within the Maximum Search Box but distant from the person by shrinking the
Current Search Box to closely surround the person as long as it keeps seeing
the person moving. If there is no motion the algorithm enlarges the Current
Search Box looking for motion.

If [...] Pixel Differences [...] the Current Threshold, the edges of the
Current Search Box are moved toward the Camera Frame in steps (typically, 5% of
the Scan Line Length), as depicted in FIG. 13; otherwise, the edges are moved
toward the Maximum Search Box in steps, as illustrated in FIG. 14.

The step sizes for moving the edges toward the Camera Frame and toward the
Maximum Search Box need not be the same and may be installation dependent.
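
The edge stepping can be sketched as below. Boxes are assumed to be
(left, top, right, bottom) tuples, `motion_seen` stands in for whatever
condition the algorithm uses to decide that the person is still being seen,
and a single step size is used for both directions for brevity.

```python
def step_edge(edge, target, step):
    """Move one edge coordinate toward target by at most step."""
    if edge < target:
        return min(edge + step, target)
    return max(edge - step, target)


def update_search_box(search_box, camera_frame, max_box, motion_seen, step):
    """Shrink toward the Camera Frame on motion, otherwise grow toward max_box."""
    target = camera_frame if motion_seen else max_box
    moved = tuple(step_edge(e, t, step) for e, t in zip(search_box, target))
    # The Current Search Box is never larger than the Maximum Search Box.
    return (
        max(moved[0], max_box[0]),
        max(moved[1], max_box[1]),
        min(moved[2], max_box[2]),
        min(moved[3], max_box[3]),
    )


max_box = (0, 100, 640, 400)
camera_frame = (200, 150, 400, 350)
step = 32  # 5% of an assumed 640-pixel Scan Line Length
shrunk = update_search_box(max_box, camera_frame, max_box, True, step)
regrown = update_search_box(shrunk, camera_frame, max_box, False, step)
```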
Processing Block 880: Decide Whether or not to Retain the Previous Images.

The algorithm relies on the image-to-image difference providing enough
motion to give a good idea of the person's position. The Confidence Measure
expresses the algorithm's assessment of how strongly the algorithm believes it
knows the person's current position. If the person is not moving much, the
difference may not be very representative of a person and the Confidence
Measure would be low. One way to increase the likelihood that there will be
significant motion between images and a higher Confidence Measure is to wait
longer between images, increasing the time interval between the Previous Images
and the Current Image.

In the illustrative embodiment of the invention, the Confidence Measure is
based on the Difference Count within the Bounding Box relative to the
Difference Counts seen in recent cycles of the algorithm. Other embodiments of
the Confidence Measure are also possible.

If the Confidence Measure is lower than the Minimum Confidence
Measure the decision is then made to retain the Previous Images. If the
Previous Image is retained, then the process continues at Processing Block 810.
If the Previous Image is not retained, then the process continues at Processing
Block 890.

Also, if the Previous Images are retained for a long period of time, the
high Confidence Measure may be due to having a Previous Image which contains a
person within the Current Search Box and a Current Image which does not contain
a person within the Current Search Box. (This could be the result of a person
completely leaving the Current Search Box within the time between the capturing
of the Previous Images and Current Image.) For this reason, if the Previous
Images have not been updated for a long time, typically 10 seconds, the answer
is set to "No" and the process continues at Processing Block 890.
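
The retain decision reduces to two checks, sketched below. The function and
parameter names are illustrative; the 10-second default follows the typical
value given in the text.

```python
def retain_previous_images(confidence, minimum_confidence,
                           seconds_since_update, max_age_seconds=10.0):
    """Return True to keep the Previous Images for another cycle."""
    if seconds_since_update >= max_age_seconds:
        # Stale Previous Images can produce a misleading difference.
        return False
    return confidence < minimum_confidence


keep_low_confidence = retain_previous_images(0.2, 0.5, 3.0)
keep_high_confidence = retain_previous_images(0.8, 0.5, 3.0)
keep_when_stale = retain_previous_images(0.2, 0.5, 12.0)
```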

Processing Block 890: Update Previous Images.

As seen in Processing Block 830, there may be more than one Previous
Image. To update the Previous Images, discard the oldest Previous Image and
make the next oldest Image the oldest Previous Image, and similarly until the
Current Image is the most recent Previous Image.
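
This update is a fixed-length history shift. A minimal sketch using a bounded
`collections.deque` is shown below; the history length of three and the string
stand-ins for images are assumptions for illustration.

```python
from collections import deque


def update_previous_images(previous_images, current_image):
    """Make the Current Image the most recent Previous Image.

    previous_images is a deque with maxlen set, ordered newest first, so
    adding on the left discards the oldest image from the right when full.
    """
    previous_images.appendleft(current_image)


history = deque(maxlen=3)
for image in ["a", "b", "c", "d"]:
    update_previous_images(history, image)
```

After the loop, the oldest image "a" has been discarded and "d" is the newest
Previous Image.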



It is to be understood that the above-described embodiments are simply
illustrative of the principles in accordance with the present invention. Other
embodiments may be readily devised by those skilled in the art which may embody
the principles in spirit and scope. Thus, it is to be further understood that
the circuit arrangements and concomitant methods described herein are not
limited to the specific forms shown by way of illustration, but may assume
other embodiments limited only by the scope of the appended claims.
[...] method for tracking a moving object within the field of view of an
electronic camera system, the method comprising [...]
sequentially generating [...] images,
[...] pixel information indicative of movement of the object [...],
determining a [...] frames being positioned to capture localized [...] of the
object within the field of view of the electronic camera [...]

[...] tracking a moving object within the field of view of an [...] the current
image and one previous [...] previous image,
(e) generating a tracking frame with reference to the bounding box, the
tracking frame being indicative of the location of the moving object within the
field of view of the electronic camera system,
(f) determining a camera frame from the tracking frame and detecting the camera
frame with the electronic camera of the electronic camera system,
(g) storing the current image as the last previous image, and
[...]

[...] captured by the spotting camera during [...]
step (e) includes the step [...] with reference to the bounding box, the
tracking frame being indicative of the location of the moving object within the
image of the spotting camera, and
step (f) includes the step of determining a camera frame [...] tracking
frame [...]

[...] of view of an [...] the current image and the one [...] frame being
indicative [...], determining a camera frame from the tracking frame and
detecting the camera frame with [...] storing the current image as the last
previous image, and [...]

[...] threshold value represen[...] value.

[...] electronic camera system [...] by the spotting [...]
(c) determining pixel d[...]





(d) if the pixel differences are below a predetermined threshold, returning to
step (b); otherwise, continuing with the next step,
(e) determining a bounding box from the pixel differences, the bounding box
being indicative of the movement detected between the current image and the one
previous image,
(f) generating a tracking frame with reference to the bounding box, the
tracking frame being indicative of the location of the moving object within the
image of the spotting camera,
(g) determining a camera frame from the tracking frame and detecting the camera
frame with the tracking camera,
(h) storing the current image as the last previous image, and
(i) returning to step (b).

8. The method as recited in claim 7 further comprising the step, executed after
said step (g), of selecting a current search area from the image captured by
the spotting camera to locate the moving object.

9. The method as recited in claim 7 further comprising the step, executed after
said step (b), of determining a threshold box within the field of view to
produce a current threshold value representative of ambient image conditions,
said pixel differences then computed with reference to the current threshold
value.

10. Circuitry for tracking a moving object within the field of view of an
electronic camera system composed of a spotting camera and a tracking camera,
the circuitry comprising
means for sequentially generating and for storing, as a previous image, the
image captured by the spotting camera during a scan interval and, as a current
image, the image captured by the spotting camera during a subsequent scan
interval,
means, coupled to said means for sequentially generating and for storing, for
determining the pixel differences between the current image and one previous
image,
means, coupled to said means for determining, for generating a bounding box
from the pixel differences, the bounding box being indicative of the movement
detected between the current image and the one previous image,
means, coupled to said means for generating a bounding box, for generating a
tracking frame from the bounding box, the tracking frame being indicative of
the location of the moving object within the image of the spotting camera,
means, coupled to said means for generating a tracking frame, for determining a
camera frame from the tracking frame, and
means, coupled to said means for determining a camera frame, for capturing
[...]
[...] responsive [...] frame buffer, coupled [...] said pixel [...] said pixel
differences [...] field of view of [...] of camera frames and [...] processor
for storing and executing [...] said control signals as outputs.